Abstract. We employ classical statistical methods of multivariate classification for the exploitation of the stellar
content of the Hamburg/ESO objective prism survey (HES). In a simulation study we investigate the precision of
a three-dimensional classification (T_eff, log g, [Fe/H]) achievable in the HES for stars in the effective temperature
range 5200 K < T_eff < 6800 K, using Bayes classification. The accuracy in temperature determination is better
than 400 K for HES spectra with S/N > 10 (typically corresponding to B_J < 16.5). The accuracies in log g and
[Fe/H] are better than 0.68 dex in the same S/N range. These precisions allow for a very efficient selection of
metal-poor stars in the HES. We present a minimum cost rule for compilation of complete samples of objects of a
given class, and a rejection rule for identification of corrupted or peculiar spectra. The algorithms we present are
also being used for the identification of other interesting objects in the HES data base, and they are applicable
to other existing and future large data sets, such as those to be compiled by the DIVA and GAIA missions.

Key words. Surveys - Methods: data analysis - stars: fundamental parameters - Galaxy: halo 
1. Introduction

Ever since powerful computers and digital spectra became available, there have
been efforts to develop algorithms for automatic spectral classification (for a
review of the early work see Kurtz 1984). The advantages of automated procedures
over manual classification are obvious. First of all, only a few experts are
able to perform accurate manual classifications, and it was therefore sought to
"freeze" this expert knowledge into computer programs. Such programs would make
it possible to obtain objective classifications by quantitative criteria, and to
process much larger data sets than by manual classification. The latter issue
has become ever more pressing with upcoming survey missions like DIVA¹, NGST²,
or GAIA³. Each of these satellites is planned to detect millions of objects, or
even one billion objects in the case of GAIA.



Send offprint requests to: nchristlieb@hs.uni-hamburg.de 

1 http://www.ari.uni-heidelberg.de/diva/

2 http://ngst.gsfc.nasa.gov/

3 http://astro.estec.esa.nl/GAIA/



In the last decade, much progress was made in the field of automatic spectral
classification, and it was demonstrated that computers are actually capable of
performing this task (for a recent, comprehensive review see Bailer-Jones 2001).
Using Kurtz' metric distance approach (Kurtz 1984), LaSala (1994) automatically
classified digitized objective prism spectra from Houk's plates, with good
results (σ = 1.14 MK types). Penprase (1994) used a similar approach, and
applied it to slit spectra with similar spectral resolution and a slightly
larger wavelength coverage (see Tab. 1 for a comparison of the data used, and
results obtained). The spectral type accuracy he reached for B0-F5 stars was a
bit worse than that of LaSala, i.e., σ = 1.5 MK types. However, as we will see
below, it is very difficult to compare the performance of classification
algorithms based on the results published in the literature, because (a) the
signal-to-noise ratio (S/N) of the data is rarely documented, although the
achievable classification accuracy depends critically on S/N; (b) different
wavelength ranges and spectral resolutions were used; and (c) the algorithms
were applied to stars in differing ranges of spectral type.
Table 1. Comparison of automatic spectral classification performances.

Method        Type of spectra            λ range       Disp.      S/N     Types  σ_type     σ_lum   Reference
PCA           Slit/photoelectric         3500-4000 Å   10 Å/px            A0-G0  1.16       0.85    W83
Metric dist.  Slit/CCD                   3800-5190 Å   67 Å/mm            F8-G8  0.4                LS94
Metric dist.  Slit/CCD                   3500-5100 Å   1-2 Å/px           B0-F5  1.5                P94
ANN           IUE                        1150-3200 Å   2 Å/px             O3-G5  1.11               VP95
Metric dist.  IUE                        1150-3200 Å   2 Å/px             O3-G5  1.38               VP95
ANN           Slit/Reticon               5750-8950 Å   7 Å/px             A0-A9  0.42*      0.15*   WTD95
ANN           Slit/Reticon               5750-8950 Å   7 Å/px             O4-M6  1.26*      0.38*   WTD97
ANN+PCA       Slit/CCD                   3510-6800 Å   5 Å/px             O-M    2.34               SGG98
Manual        Objective prism, widened   3800-5190 Å   108 Å/mm   > 100?  B2-M7  0.6**      0.25**  H75-88
Metric dist.  Digitized objective prism  3800-5190 Å   1-3 Å/px   > 100?  B      1.14               LS94
ANN           Digitized objective prism  3800-5190 Å   1-3 Å/px   > 100?  B2-M7  0.82***            BJIvH98
ANN           Slit/CCD                   3850-4450 Å   0.65 Å/px  > 20    F5-K5  0.57-0.64          Setal01
Bayes         Digitized objective prism  3200-5300 Å   7-18 Å/px  10-30   F2-K0  < 1.6      < 0.55  This work

σ_type is given in spectral types, σ_lum in luminosity classes.

References: W83 = Whitney (1983); LS94 = LaSala (1994); P94 = Penprase (1994); VP95 = Vieira & Ponz (1995);
WTD95 = Weaver & Torres-Dodgen (1995); WTD97 = Weaver & Torres-Dodgen (1997); SGG98 = Singh et al. (1998);
H75-88 = Houk (1975), Houk (1978), Houk (1982), Houk & Smith-Moore (1988); BJIvH98 = Bailer-Jones et al. (1998);
Setal01 = Snider et al. (2001)

*   Mean absolute deviation
**  According to von Hippel et al. (1994)
*** 68 % quantile

The influence of the latter on the achievable classification accuracy is nicely
demonstrated by comparing the results of Weaver & Torres-Dodgen (1995) with
those of Weaver & Torres-Dodgen (1997). In the former paper, the authors report
on supervised automatic classification of stars of spectral type A0-A9 with a
multi-layer artificial neural network (ANN) with one hidden layer, trained with
a back-propagation algorithm. They reached a mean absolute deviation of 0.42
spectral types and 0.15 luminosity classes. In the second paper, the ANN was
applied to stars in the range O4-M6, and the mean absolute deviations were only
1.26 spectral types and 0.38 luminosity classes. The results of Weaver &
Torres-Dodgen have also shown that spectral classification in the near infrared
can be done with the same accuracy as in the "classical" MK spectral range, with
spectra of much lower resolution. The resolution used by Weaver & Torres-Dodgen
was only 7 Å per pixel, and their spectral range 5750-8950 Å. Their results are
comparable to those achieved by others at three times higher spectral resolution
in the optical or UV.

To continue with our brief review, in recent years ANNs have been successfully
used for supervised automatic spectral classification by several groups. All of
them used multilayer back-propagation networks (MBPNs). Vieira & Ponz (1995)
automatically classified spectra of O3-G5 stars obtained with the International
Ultraviolet Explorer (IUE; dispersion 2 Å per pixel) with an MBPN. The 1σ error
was 1.11 spectral types. They found their ANN classification to be superior to a
classification with a metric distance method (σ = 1.38 types). The data used by
Singh et al. (1998) were optical (3500-6800 Å) slit spectra with a dispersion of
5 Å per pixel. They used Principal Component Analysis (PCA) to preprocess their
spectra, and to reduce the number of input nodes. They obtained an accuracy of
2.34 spectral types over the full MK range (O-M). Bailer-Jones et al. (1998)
again used Houk's plate material, digitized with the APM plate scanner, yielding
a wavelength range of 3800-5190 Å and a dispersion of 1-3 Å per pixel. Their
best ANN configuration classified these spectra with an error distribution
having a 68 % quantile of 0.82 types, and the luminosity classification was
correct for 95 % of the test sample spectra.



Recently, Snider et al. (2001) used an MBPN for the derivation of the stellar
parameters T_eff, log g and [Fe/H] from moderate resolution (0.65 Å per pixel)
spectra. Although their aim is to assign continuous parameter values to each
spectrum, while we as well as the above-mentioned authors carried out discrete
classifications, we include their work in our review because Snider et al.
applied their technique to metal-poor stars, which is also the object type we
are mainly concerned with in this paper. Snider et al. report classification
accuracies of σ_Teff = 135-150 K, σ_logg = 0.25-0.30 dex and σ_[Fe/H] =
0.15-0.20 dex. However, it appears from the upper panel of their Fig. 4 that
subgiants and horizontal branch stars have been excluded from the sample of
stars they studied. A rough graphical analysis of their Fig. 4 reveals that,
unlike real samples of stars emerging e.g. from wide-angle spectroscopic
surveys, which do contain subgiants and horizontal-branch stars, their sample
can be classified in log g with a similar precision by dividing it into two
classes "by hand", that is, assigning log g = 2.5 to all stars with
T_eff < 5000 K, and log g = 4.5 to all stars with T_eff > 5000 K. Furthermore,
it is questionable whether any feature is present in their set of spectra that
allows for a gravity classification, since they used continuum-divided spectra.
The Balmer jump, which is a gravity indicator in cool stars, is therefore
removed. In conclusion, while Snider et al. succeeded in using ANNs for
automated classification in T_eff and [Fe/H], it remains to be demonstrated with
a realistic sample that rectified moderate-resolution spectra indeed contain the
information needed for a useful gravity classification.

ANN techniques and "classical" statistical methods 
such as Bayes and minimum cost rule classifications often 
perform equally well, in terms of e.g. minimising the to- 
tal number of misclassifications. In the present work, we 
employ statistical methods, because their mathematical 
properties are well-studied, and the formulation of classi- 
fication rules in the framework of mathematical statistics 
makes them very transparent. 

Before we go into details of the methods we developed (Sect. 3), we give a
brief overview of the Hamburg/ESO Survey (HES) in Sect. 2, for better
readability. In Sect. 4 we investigate the classification performance for stars
in the effective temperature range 5200 K < T_eff < 6800 K achievable in the
HES, by a simulation study. We summarize our conclusions in Sect. 5.

2. The Hamburg/ESO Survey 



The HES (Wisotzki et al. 1996, 2000) is an objective-prism survey designed to
select bright (12.5 ≲ B_J ≲ 17.5) quasars in the southern extragalactic sky
(δ < +2.5°; |b| > 30°). It is based on IIIa-J plates taken with the 1 m ESO
Schmidt telescope and its 4° prism. The plates were digitized at Hamburger
Sternwarte. The HES spectra cover a wavelength range of 3200 Å < λ < 5200 Å and
have a seeing-limited spectral resolution of typically 15 Å at Hγ and 10 Å at
Ca II K 3934 Å. This resolution makes it possible to also exploit the stellar
content of the survey very efficiently. For HES example spectra see Fig. 1.

The goals of automatic spectral classification in the HES are (a)
three-dimensional classification (T_eff, log g, [Fe/H]) of the total HES data
base currently used for the exploitation of the stellar content, consisting of
~ 4 million spectra, (b) compilation of complete samples of objects of specific
classes, and (c) identification of peculiar objects. These goals are similar to
those emerging in DIVA and GAIA. Interesting classes of stars that can be found
on HES plates include extremely metal-poor halo field stars (Christlieb & Beers
2000), field horizontal branch stars (Christlieb et al., in preparation), carbon
stars (Christlieb et al. 2001a), and white dwarfs (Christlieb et al. 2001b).
A large data base with spectra of known type is also very useful for
cross-identification with surveys in other wavelength ranges, such as FAUST (as
has been demonstrated by Brosch et al. 2000), the Two Micron All Sky Survey⁴
(2MASS), or the Deep Near Infrared Survey⁵ (DENIS). Furthermore, the data from
these surveys can be used to extend the feature vectors associated with HES
spectra (see below), and improve the automatic classification in the HES.

4 http://www.ipac.caltech.edu/2mass/

5 http://cdsweb.u-strasbg.fr/denis.html



3. Automatic spectral classification 

In order to achieve our classification aims, we need to construct a decision
rule which allows us to assign a spectrum with feature vector x to one of the
n_c classes Ω_j, j = 1 ... n_c, defined in the specific classification context.
That is, we want to carry out a supervised classification, as opposed to
unsupervised classification, where the aim is to group objects into classes not
defined before the classification process. Methods of unsupervised
classification and their application to HES spectra are presented in Hennig &
Christlieb (2002).



3.1. Feature space 

The HES data base of digital spectra can be represented by feature vectors x,
consisting of a set of continuous variables x_i, i.e.

$$ \mathbf{x} = (x_1, \ldots, x_d), \tag{1} $$

where d is the number of features used. It is critical for automatic
classification to have a set of reliable features at hand. A wide range of
spectral features is automatically measured in the digitized HES
objective-prism spectra during the data reduction process (cf. Tab. 2): stellar
absorption and emission lines, absorption bands, continuum shape including
spectral breaks, and bisecting points of the spectral density distribution.
These features are measured in unfiltered HES spectra with the methods
described in Christlieb et al. (2001b).

3.2. Choosing a feature combination 

It is necessary to select a subset of the available features for each
classification problem, and each S/N level, for several reasons.

(1) Blended lines, e.g. Hε + Ca H, can confuse the classification.

(2) It is advantageous to exclude redundant features from the set of features
    used for classification, since the usage of fewer features results in more
    stable estimates of the parameters of the multivariate normal distributions
    (see Eq. (3) below).

(3) The optimal feature set can vary with S/N. For instance, at low S/N it can
    be useful to use only continuum shape parameters and colors for
    classification, because stellar lines can no longer be detected reliably.

Fig. 1. HES example spectra, illustrating the large variety of object types that can be identified in that survey.
Abscissae are wavelength in Å, ordinates are photographic density in arbitrary units. Note that wavelength is decreasing
from left to right. The sharp drop at ~ 5400 Å is due to the IIIa-J emulsion sensitivity cutoff ("red edge"). Spectra of
the following object types are shown: (a) DQ white dwarf (the red edge of this spectrum is disturbed by an overlapping
spectrum); (b) cool carbon star; (c) DB white dwarf; (d) PG 1159 star (the blend of He II λ4686 and C IV λ4660 is
marked); (e) cataclysmic variable star; (f) extremely metal-poor star (showing a very weak Ca K line); (g) FHB/A
star; (h) cool DA white dwarf. The lower two spectra demonstrate that the strength of the Balmer jump can be used
as an indicator for the surface gravity log g: it is invisible in the log g ~ 8 white dwarf.

Table 2. Automatically measured spectral features in the HES. The measurement methods of the features #1-10,
#17-20 and #25-27 have been described in Christlieb et al. (2001b) and Christlieb et al. (2001a). The line indices
KP, GP and HP were proposed by Beers et al. (1990), and the definition of HP was later refined by Beers et al. (1999).
We adapted these indices for the lower resolution of the HES spectra (in particular, the positions of some of the
continuum bands were changed), and calibrated them against stars of Beers et al. present on HES plates. The 1σ
scatters of these calibrations are 1.22 Å, 1.41 Å and 1.61 Å, respectively. balmsum can be used to predict HP with an
accuracy of σ = 1.55 Å, and this feature is therefore superior to the directly derived HP. From the half power point
distances dx_hpp1 and dx_hpp2, U - B and B - V colors can be derived with accuracies of better than 0.1 mag; c1
can be measured in HES spectra with a precision of 0.15 mag (Christlieb et al. 2001b).

Number  Name          Description                                      Measurement method
#1      all5160eqw    W_λ of Mg I b triplet/TiO λ 5168                 Iterative fit procedure
#2      all4861eqw    W_λ of Hβ                                        Iterative fit procedure
#3      all4388eqw    W_λ of Fe I λ 4383+85                            Iterative fit procedure
#4      all4340eqw    W_λ of Hγ                                        Iterative fit procedure
#5      all4300eqw    W_λ of G band                                    Iterative fit procedure
#6      all4261eqw    W_λ of Cr I λ 4254+75 + Fe I 4260+72             Iterative fit procedure
#7      all4227eqw    W_λ of Ca I λ 4227                               Iterative fit procedure
#8      all4102eqw    W_λ of Hδ                                        Iterative fit procedure
#9      all3969eqw    W_λ of Ca H + Hε                                 Iterative fit procedure
#10     all3934eqw    W_λ of Ca K                                      Iterative fit procedure
#11     balmsum       all4861eqw+all4340eqw+all4102eqw                 Meta-feature
#12     CaBreak_sn    S/N of calcium break                             Template matching
#13     CaBreak_cont  Contrast of calcium break to continuum           Template matching
#14     KP            Strength of Ca K                                 Ratio of average pixel values
#15     GP            Strength of G band                               Ratio of average pixel values
#16     HP            Strength of Hδ                                   Ratio of average pixel values
#17     C2idx1        Strength of C2 λ 5165                            Ratio of average pixel values
#18     C2idx2        Strength of C2 λ 4737                            Ratio of average pixel values
#19     CNidx2        Strength of CN λ 4216                            Ratio of average pixel values
#20     CNidx3        Strength of CN λ 3883                            Ratio of average pixel values
#21     klcomp_1      1. continuum shape coefficient                   PCA
#22     klcomp_2      2. continuum shape coefficient                   PCA
#23     klcomp_3      3. continuum shape coefficient                   PCA
#24     klcomp_4      4. continuum shape coefficient                   PCA
#25     dx_hpp1       Half power point distance 1                      Summing of pixel values
#26     dx_hpp2       Half power point distance 2                      Summing of pixel values
#27     c1            Strömgren medium band color index                Function of summed pixel values

The evaluation of the suitability of all 2^d - 1 possible combinations of d
available features for a given classification problem is a complex task at
first glance. However, it is usually possible to select a set of appropriate
features by physical considerations alone. E.g., when it is desired to select
metal-poor stars, only those features that are possibly useful as indicators
for T_eff, log g, [Fe/H], and [C/Fe] need to be considered. By means of accuracy
considerations and parameter studies, further features can be rejected.
Finally, it is also possible to reduce the dimensionality of the feature space
by a priori combining redundant features, e.g. the equivalent widths of the
Balmer lines to a sum of equivalent widths. The remaining set of features is
then evaluated with the methods described in Sect. 4.1.
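To give a sense of scale for the 2^d - 1 combinations mentioned above, the following Python sketch counts the non-empty feature subsets for the d = 27 features of Tab. 2 and enumerates them for a short pre-selected list. The list `candidates` is a hypothetical illustration of a pre-selection by physical considerations, not the selection procedure used in this work.

```python
from itertools import combinations


def n_feature_subsets(d: int) -> int:
    """Number of non-empty subsets of d features, i.e. 2**d - 1."""
    return 2**d - 1


def feature_subsets(features):
    """Yield every non-empty combination of the given feature names."""
    for size in range(1, len(features) + 1):
        yield from combinations(features, size)


# All 27 automatically measured HES features (Tab. 2).
n_all = n_feature_subsets(27)

# Hypothetical pre-selection by physical considerations (illustration only).
candidates = ["c1", "balmsum", "KP", "GP"]
subsets = list(feature_subsets(candidates))

print(n_all)         # 134217727
print(len(subsets))  # 15
```

Restricting the search to four candidate features reduces the number of combinations to evaluate from about 1.3 x 10^8 to 15, which is why a physically motivated pre-selection makes a systematic evaluation practical.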



3.3. Learning sample 

For supervised classification, a learning sample is needed. For our purposes,
we define a learning sample to be a set of n_l objects for which the feature
vectors are known,

$$ \{\mathbf{x}\} = (\mathbf{x}_1, \ldots, \mathbf{x}_{n_l}), \tag{2} $$

and for which it is known to what class they belong. These classes can be
defined, e.g., by grouping a set of objects according to their stellar
parameters (e.g. T_eff, log g, [Fe/H]), or by manually assigning classes to a
set of spectra by comparison with reference objects. With the help of a
learning sample, information on the class-conditional probability densities
p(x|Ω_j) can be gained. p(x|Ω_j) dx is the probability to observe a feature
vector in the range x ... x + dx in the class Ω_j. We inspected the
one-dimensional class-conditional probability distributions of the classes
covered by the learning samples used in this work, and qualitatively found
their shapes to agree well with Gaussians. We hence model p(x|Ω_j) by
multivariate normal distributions, i.e.,

$$ p(\mathbf{x}|\Omega_j) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_j|^{1/2}} \exp\left\{-\frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_j)'\,\Sigma_j^{-1}\,(\mathbf{x}-\boldsymbol{\mu}_j)\right\}, \tag{3} $$

where j denotes the class number, μ_j the mean feature vector of class Ω_j, and
Σ_j the covariance matrix of class Ω_j.
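As an illustration of how the parameters of Eq. (3) might be estimated from the learning sample objects of one class, the following Python sketch computes the sample mean vector and covariance matrix and evaluates ln p(x|Ω_j). It is a minimal sketch using only the standard library; the function names and the toy sample are assumptions made for this example, and the unbiased (n - 1) covariance estimator is our choice, not something stated in the text.

```python
import math


def mean_and_cov(sample):
    """Sample mean vector and unbiased covariance matrix of a list of feature vectors."""
    n, d = len(sample), len(sample[0])
    mu = [sum(x[i] for x in sample) / n for i in range(d)]
    cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in sample) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve_and_logdet(a, b):
    """Solve a y = b by Gaussian elimination with partial pivoting.

    Also returns ln|det a|. The matrix a must be non-singular.
    """
    d = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    logdet = 0.0
    for c in range(d):
        p = max(range(c, d), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        logdet += math.log(abs(m[c][c]))
        for r in range(c + 1, d):
            f = m[r][c] / m[c][c]
            for k in range(c, d + 1):
                m[r][k] -= f * m[c][k]
    y = [0.0] * d
    for r in reversed(range(d)):
        y[r] = (m[r][d] - sum(m[r][k] * y[k] for k in range(r + 1, d))) / m[r][r]
    return y, logdet


def log_density(x, mu, cov):
    """ln p(x|Omega_j) for the multivariate normal distribution of Eq. (3)."""
    d = len(x)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    y, logdet = solve_and_logdet(cov, diff)          # y = cov^-1 (x - mu)
    maha2 = sum(di * yi for di, yi in zip(diff, y))  # (x - mu)' cov^-1 (x - mu)
    return -0.5 * (d * math.log(2.0 * math.pi) + logdet + maha2)


# Toy learning sample of one class with two features (illustration only).
toy_class = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
mu, cov = mean_and_cov(toy_class)
print(mu, cov, log_density([1.0, 1.0], mu, cov))
```

Working with the logarithm of the density avoids numerical underflow for feature vectors far from the class mean.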
3.4. Decision rules

A central issue in automatic classification is the construc- 
tion of a decision rule which is optimal for the given classi- 
fication problem. In the HES, we use three decision rules: 
the Bayes rule, a minimum cost rule, and a rejection rule. 

3.4.1. Bayes' rule 

Classification with Bayes' rule minimizes the total number of
misclassifications, if the true distribution of class-conditional probabilities
p(x|Ω_i) is used (Hand 1981; Anderson 1984). Using Bayes' theorem,

$$ P(\Omega_i|\mathbf{x}) = \frac{P(\Omega_i)\,p(\mathbf{x}|\Omega_i)}{\sum_{k=1}^{n_c} P(\Omega_k)\,p(\mathbf{x}|\Omega_k)}, \tag{4} $$

posterior probabilities P(Ω_i|x) can be calculated. A spectrum of unknown
class, with given feature vector x, can then be classified using Bayes' rule:

Bayes' rule: Assign a spectrum with feature vector x to the
class with the highest posterior probability P(Ω_i|x).
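Eq. (4) and Bayes' rule can be sketched in a few lines of Python. For brevity the example uses a single feature and one-dimensional Gaussians from the standard library; the priors and class parameters are hypothetical values chosen for illustration only.

```python
from statistics import NormalDist


def posteriors(x, priors, densities):
    """Posterior probabilities P(Omega_i|x) from Bayes' theorem, Eq. (4)."""
    joint = [a * p.pdf(x) for a, p in zip(priors, densities)]
    total = sum(joint)
    return [j / total for j in joint]


def bayes_classify(x, priors, densities):
    """Bayes' rule: index of the class with the highest posterior probability."""
    post = posteriors(x, priors, densities)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical two-class problem with one feature (illustration only).
priors = [0.5, 0.5]
densities = [NormalDist(mu=0.0, sigma=1.0), NormalDist(mu=2.0, sigma=1.0)]

post = posteriors(0.5, priors, densities)
label = bayes_classify(0.5, priors, densities)
print(post, label)
```

With equal priors and equal widths, the decision boundary of this toy problem lies midway between the two class means, so the feature value 0.5 is assigned to the first class.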

3.4.2. Minimum cost rule 

In most of the classification problems arising in the HES it 
is desired to gather a sample of objects of a specific class, 
or a specific set of classes. In these cases, Bayes' rule is not 
appropriate, because we do not want to minimize the to- 
tal number of misclassifications, but the misclassifications 
between the desired class(es) of objects, and the remain- 
ing classes. Suppose we have three classes, A-, F-, and 
G-type stars, and we want to gather a complete sample 
of A-type stars. Then only misclassifications between A- 
type stars and F- and G-type stars (and vice versa) are 
of interest. More specifically, misclassifications of A-type 
stars to F- and G-type stars (leading to incompleteness) 
are least desirable when a complete sample shall be gath- 
ered, and erroneous classification of F- and G-type stars as 
A-type stars (resulting in sample contamination) can be 
accepted at a moderate rate. Misclassifications between 
F- and G-type stars can be totally ignored, because the 
target object type is not involved. 

Classification aims like this can be realized by using a minimum cost rule.
Cost factors r_hk, with

$$ 0 \le r_{hk} \le 1; \quad h = 1, \ldots, n_c; \quad k = 1, \ldots, n_c, \tag{5} $$

allow us to assign relative weights to individual types of misclassification.
The cost factor r_hk is the relative weight of a misclassification from class
Ω_h to class Ω_k.

Suppose we have an object of unknown class, with feature vector x. We ask how
large the cost is if it belongs to class Ω_h, and would be assigned to class
Ω_k, h ≠ k. The cost C_{h→k}(x) is:

$$ C_{h \to k}(\mathbf{x}) = r_{hk}\,P(\Omega_h|\mathbf{x}) = r_{hk}\,\frac{P(\Omega_h)\,p(\mathbf{x}|\Omega_h)}{\sum_{i=1}^{n_c} P(\Omega_i)\,p(\mathbf{x}|\Omega_i)} = r_{hk}\,\frac{a_h\,p_h(\mathbf{x})}{\sum_{i=1}^{n_c} a_i\,p_i(\mathbf{x})}. \tag{6} $$

In the last step we have used the abbreviations P(Ω_h) = a_h and
p(x|Ω_h) = p_h(x). We do not know to which of the possible classes Ω_h,
h = 1, ..., n_c, the object actually belongs. Therefore, we estimate the
expected cost C_k(x) for assigning an object with feature vector x to the class
Ω_k by computing the following sum of costs:

$$ C_k(\mathbf{x}) = \sum_{h=1}^{n_c} C_{h \to k}(\mathbf{x}) = \frac{\sum_{h=1}^{n_c} r_{hk}\,a_h\,p_h(\mathbf{x})}{\sum_{i=1}^{n_c} a_i\,p_i(\mathbf{x})}. \tag{7} $$

Now we can formulate the minimum cost rule, which minimizes the total cost
(Hand 1981).

Minimum Cost Rule: Assign an object with feature vector
x to the class Ω_k with the lowest expected cost C_k(x).

If the cost factors are chosen such that r_hk = 1 - δ_hk, the minimum cost rule
classification is identical to classification using Bayes' rule. In this case
the cost for assigning the class Ω_k to a spectrum with feature vector x is the
probability that the object belongs to one of the other classes h ≠ k. This
follows immediately from Eq. (7). If r_hk ≠ 1 - δ_hk, the total number of
misclassifications is not minimized, so that the quality of a minimum cost rule
classification has to be evaluated by other criteria.

For any given classification aim, one can divide the cost factors to be chosen
into three sets:

t2o: Cost factor for misclassification of an object of the target class ('t')
     to ('2') one of the other classes ('o').

o2t: Cost factor for contamination of the target class.

o2o: Cost factor for misclassification between other classes.

Since sample completeness and contamination are interdependent, in practice
only the relative value t2o/o2t has to be adjusted. For this purpose, the
classification results as a function of t2o/o2t are evaluated. The expected
error rates, estimated e.g. with the "leaving one out" method (see Sect. 4.1
below), tell which level of completeness and sample contamination will be
achieved. Christlieb et al. (1998) presented a software tool for a convenient
choice of cost factors.
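The expected cost of Eq. (7) and the division of the cost factors into the sets t2o, o2t and o2o can be sketched as follows. This is a minimal Python illustration, not the software tool cited above; the helper `cost_matrix`, the numerical cost factors and the values of a_h p_h(x) for the three classes A, F and G are hypothetical.

```python
def expected_costs(weighted, r):
    """Expected cost C_k(x) of Eq. (7) for every class k.

    weighted[h] holds a_h * p_h(x); r[h][k] holds the cost factor r_hk.
    """
    total = sum(weighted)
    n_c = len(weighted)
    return [sum(r[h][k] * weighted[h] for h in range(n_c)) / total
            for k in range(n_c)]


def min_cost_classify(weighted, r):
    """Minimum cost rule: index of the class with the lowest expected cost."""
    costs = expected_costs(weighted, r)
    return min(range(len(costs)), key=costs.__getitem__)


def cost_matrix(n_c, target, t2o, o2t, o2o):
    """Build r_hk from the three sets of cost factors; correct assignments cost 0."""
    r = [[0.0] * n_c for _ in range(n_c)]
    for h in range(n_c):
        for k in range(n_c):
            if h == k:
                continue
            if h == target:
                r[h][k] = t2o      # target object lost to another class
            elif k == target:
                r[h][k] = o2t      # other object contaminating the target class
            else:
                r[h][k] = o2o      # confusion among the other classes
    return r


# Hypothetical values of a_h * p_h(x) for classes A, F, G (illustration only).
weighted = [0.2, 0.5, 0.3]

bayes_like = cost_matrix(3, target=0, t2o=1.0, o2t=1.0, o2o=1.0)  # r_hk = 1 - delta_hk
complete_a = cost_matrix(3, target=0, t2o=1.0, o2t=0.1, o2o=0.0)  # favors completeness of A

print(min_cost_classify(weighted, bayes_like))  # 1
print(min_cost_classify(weighted, complete_a))  # 0
```

With r_hk = 1 - δ_hk the object is assigned to class F, which has the highest posterior probability, as Bayes' rule would do. With a large ratio t2o/o2t the same object is assigned to the target class A, trading sample contamination for completeness.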



3.4.3. Rejection rule 

Non-mathematically speaking, Bayes' rule assigns the class with the highest
relative resemblance to each spectrum to be classified. However, it is ignorant
of the absolute resemblance: a spectrum with feature vector x may be assigned
to a class with very low posterior probability P(Ω_i|x), if P(Ω_i|x) is even
lower for all other classes. This means that a class is assigned to all
spectra, even to "garbage spectra" which are disturbed, for instance, by plate
artifacts. Therefore, it is useful to apply a rejection rule in addition to
either the Bayes rule or the minimum cost rule. The rejection rule can also be
used "stand alone" for the identification of peculiar objects, e.g., quasars.

Rejection rule: Reject an object from classification to
class Ω_i, if A(Ω_i; x) > β.

The parameter β is a threshold to be chosen, and the parameter A is the
atypicality index suggested by Aitchison et al. (1977),

$$ A(\Omega_i; \mathbf{x}) = \Gamma\left\{\frac{d}{2};\ \frac{1}{2}(\mathbf{x}-\boldsymbol{\mu}_i)'\,\Sigma_i^{-1}\,(\mathbf{x}-\boldsymbol{\mu}_i)\right\}, \tag{8} $$

where Γ(a; x) is the incomplete gamma function and d the number of features
used for classification. Use of the above rejection criterion is identical to
performing a χ² test of the null hypothesis H_0 that an object with feature
vector x belongs to class Ω_i at significance level 1 - β, against the
alternative hypothesis H_1 that it does not belong to class Ω_i. We reject the
null hypothesis if its significance level is low, i.e., if it is very unlikely
that a feature vector x is observed for class Ω_i, given that the multivariate
normal distributions (3) are the true distributions of the class-conditional
probabilities p(x|Ω_i).
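The rejection rule can be sketched in Python as follows. We assume here that Γ(a; x) in Eq. (8) denotes the regularized lower incomplete gamma function, so that A lies between 0 and 1 and equals the cumulative χ² distribution with d degrees of freedom evaluated at the squared Mahalanobis distance q = (x - μ_i)' Σ_i^{-1} (x - μ_i). The series implementation, the function names and the threshold value are illustrative assumptions, not the implementation used in the HES.

```python
import math


def reg_lower_gamma(a, x, tol=1e-14, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x), via its power series."""
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < tol * total:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def atypicality(q, d):
    """Atypicality index of Eq. (8) for squared Mahalanobis distance q and d features."""
    return reg_lower_gamma(0.5 * d, 0.5 * q)


def reject(q, d, beta):
    """Rejection rule: True if the object should be rejected from the class."""
    return atypicality(q, d) > beta


# Three features, as used for the three-dimensional classification; beta is hypothetical.
print(atypicality(7.815, 3))      # close to 0.95
print(reject(2.0, 3, beta=0.99))  # False: a typical member of the class
print(reject(30.0, 3, beta=0.99)) # True: far from the class mean
```

For d = 3, a squared Mahalanobis distance of 7.815 corresponds to the 95 % quantile of the χ² distribution, so a threshold β = 0.95 would reject about 5 % of the genuine class members if the Gaussian model holds.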

4. Classification performance 

In a first application of automatic spectral classification in the HES, we
selected candidates for extremely metal-poor halo field stars. Christlieb
(2000) and Christlieb & Beers (2000) have shown that with this method, a very
efficient selection of metal-poor stars is feasible. 80 % of an investigated
sample of 56 highest priority metal-poor candidates were shown by
medium-resolution follow-up spectroscopy to have metallicities below -2.0 dex,
and results based on a larger sample of stars, including also fainter and lower
priority candidates, indicate that the overall efficiency for the selection of
stars with [Fe/H] < -2.0 is ~ 60 % in the HES (Christlieb et al., in
preparation). This is the most efficient selection of metal-poor stars ever
obtained in a wide-angle survey for such stars. In this paper we focus on
results of a systematic investigation of the classification performance for
stars in the effective temperature range 5200 K < T_eff < 6800 K achievable in
the HES, by means of a simulation study.

4.1. Evaluation of classification rules 

Classification rules can be evaluated by the number of expected
misclassifications (in the case of Bayes' rule), or by the total expected cost
(in the case of the minimum cost rule). The three most important methods to
estimate these numbers are (Deichsel & Trampisch 1985):

(1) Re-substitution

(2) "Hold out" method

(3) "Leaving one out" method.

Re-substitution means that one uses the learning sample also as test sample.
The drawback of this method is that one underestimates the number of expected
misclassifications, because a classification rule derived with the help of a
finite learning sample is always adapted to the individual composition of the
learning sample. Therefore, the estimation of the expected number of
misclassifications is biased (Deichsel & Trampisch 1985).

An improvement in this respect is gained when the "hold out" method is used.
Here one randomly divides the learning sample disjointly into a new, smaller
learning sample, and a test sample. Since the learning sample and test sample
are completely independent in this case, an unbiased estimate of the expected
error rates is possible (Deichsel & Trampisch 1985). However, the drawback is
that one needs a large enough learning sample. When modeling the
class-conditional probabilities with multivariate normal distributions, the
learning sample size has to be large enough to ensure a robust estimation of
the parameters of the distributions.

The problem of learning sample size can be circumvented by using the "leaving
one out" method. Suppose we have a learning sample of size n_l. We exclude
object i from the learning sample, and construct the classification rule using
the n_l - 1 remaining objects. Object i is then classified with this
classification rule. This procedure is repeated n_l times, so that each object
of the learning sample is excluded once, and used as test sample. By adding up
the numbers of misclassifications obtained in each step, one gets an unbiased
estimate of the expected error rate (Deichsel & Trampisch 1985). The only
drawback of this method is that it consumes a lot more computing time than the
previously mentioned methods, since n_l classification rules have to be
constructed. However, the computing time increases only linearly with learning
sample size n_l, so that the usage of the "leaving one out" method was feasible
for all HES learning samples used so far (the largest learning sample used had
n_l = 165 000).
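The difference between re-substitution and the "leaving one out" method can be sketched with a small Python example. For brevity it uses a single feature, one-dimensional Gaussians, priors estimated from class frequencies, and Bayes' rule; the toy learning sample and all function names are hypothetical and serve only to illustrate the procedure described above.

```python
from statistics import NormalDist, mean, stdev


def fit(sample):
    """Fit a prior and a 1-D Gaussian per class; sample is a list of (feature, label)."""
    by_class = {}
    for x, label in sample:
        by_class.setdefault(label, []).append(x)
    n = len(sample)
    return {label: (len(xs) / n, NormalDist(mean(xs), stdev(xs)))
            for label, xs in by_class.items()}


def classify(x, model):
    """Bayes' rule: class with the largest prior times class-conditional density."""
    return max(model, key=lambda label: model[label][0] * model[label][1].pdf(x))


def resubstitution_error(sample):
    """Error rate when the learning sample is also used as test sample."""
    model = fit(sample)
    return sum(classify(x, model) != label for x, label in sample) / len(sample)


def leaving_one_out_error(sample):
    """Error rate when each object is classified by a rule built without it."""
    errors = 0
    for i, (x, label) in enumerate(sample):
        reduced = sample[:i] + sample[i + 1:]   # the n_l - 1 remaining objects
        if classify(x, fit(reduced)) != label:
            errors += 1
    return errors / len(sample)


# Toy learning sample with overlapping classes (illustration only).
sample = [(0.0, "A"), (0.4, "A"), (0.9, "A"), (1.5, "A"),
          (1.2, "F"), (1.9, "F"), (2.3, "F"), (2.8, "F")]

print(resubstitution_error(sample), leaving_one_out_error(sample))
```

Note that this naive version refits the full model in every step; for Gaussian class models the means and covariances can instead be updated incrementally when one object is removed.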



4.2. Simulation study on the classification performance 
in the HES 

For our simulation study, we employed a grid of model 
spectra converted to objective prism spectra with the 



(1) Re-substitution 



methods described in Christlieb et al. (2001b). The grid 
covers the following stellar parameter range: 

T cS = 5200(200)6800 K 
\ogg = 2.2(0.8)4.6 
[Fe/H] = -0.3, -0.9, -1.5(0.3) -3.6 

The values in brackets refer to grid point distances. The
grid defines 360 classes. Since one of the aims of our
simulation study is to investigate how the classification
accuracy changes with S/N, we need to simulate spectra of
different S/N. For this, we added Gaussian noise to the grid
of simulated spectra so that spectra with S/N = 5(5)30
resulted. The adequacy of this noise model for HES spectra
has been demonstrated by Christlieb et al. (2001b). It
is necessary to produce learning samples for different S/N
levels because the widths of the class-conditional
probability distributions change with S/N (see Fig. 2). We then
performed a Bayes classification, using the three features
c_1, balmsum (the sum of the equivalent widths of Hβ, Hγ
and Hδ), and the Ca K index KP. This feature set was
found to be the most suitable for the desired
three-dimensional classification in parameter studies, and by
systematic evaluation of the classification performance of
different feature combinations.

Fig. 2. Distribution of the three features used for classification of main sequence turnoff stars in one learning sample
class, and for different signal-to-noise ratios.
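As a rough check of the grid definition and the noise step, the
following sketch enumerates the parameter grid and adds Gaussian noise
at a chosen S/N. The helper names are hypothetical, and the S/N
definition used here (mean flux divided by a constant noise sigma) is
an assumption made for illustration; the noise model actually used for
HES spectra is the one described in Christlieb et al. (2001b).

```python
import itertools
import numpy as np

def arange_inclusive(start, step, stop):
    """Values start, start+step, ..., stop, i.e. the start(step)stop notation."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]

teff = arange_inclusive(5200, 200, 6800)                 # K
logg = arange_inclusive(2.2, 0.8, 4.6)                   # dex
feh = [-0.3, -0.9] + arange_inclusive(-1.5, -0.3, -3.6)  # dex

# Every combination of (T_eff, log g, [Fe/H]) is one class.
grid = list(itertools.product(teff, logg, feh))

snr_levels = arange_inclusive(5, 5, 30)                  # S/N = 5(5)30

def add_noise(flux, snr, rng):
    """Add Gaussian noise with sigma = mean(flux) / snr (assumed S/N definition)."""
    sigma = np.mean(flux) / snr
    return flux + rng.normal(0.0, sigma, size=flux.shape)
```

With 9 temperatures, 4 gravities and 10 metallicities, `len(grid)`
reproduces the 360 classes quoted in the text.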

For each spectral class 500 simulated spectra were com- 
puted, which is a large enough number to randomly sub- 
divide the grid into a learning sample and an independent 
test sample. To obtain a realistic estimate of the classi- 
fication performance, two effects have to be taken into 
account: 



Undersampling of the error distribution: In our simulation
study, we use a grid of stellar parameters.
Therefore, if a classification error is smaller than half
of the grid point distance, an error of zero is measured.
Therefore, the classification error is systematically
underestimated in these cases.

Discretization error: Real samples of stars have a
continuous distribution of stellar parameters. These
parameters will be mapped to our discrete grid. This results in
classification errors for stars having stellar parameters
lying between two grid points.

We have taken these effects into account by applying
(upward) corrections to the classification errors measured on
our model spectra grid (see Fig. 3). For the estimation
of the corrections for error distribution undersampling,
Gaussian random errors were added to the grid point
parameters to simulate classification errors, and the
measured errors were compared with the errors known from
the chosen σ of the Gaussian distribution. The
discretization error correction to be applied was derived by mapping
continuously distributed stellar parameters to our grid,
and computing the mean difference between real
parameters and the parameters detected by the grid.

In the stellar parameter range we explored so far, the
corrected accuracy in effective temperature classification
is better than 400 K for spectra with S/N > 10, which
typically corresponds to B_J < 16.5 (see Christlieb et al.
2001b). The accuracies in log g and [Fe/H] are better than
0.68 dex for the same magnitude range. Note that the
accuracy in [Fe/H] strongly depends on [Fe/H] itself, since
the Ca K line, used as metallicity indicator, is not
detectable in the spectra of the lowest metallicity turnoff
stars at the spectral resolution of the HES. Therefore, a
metallicity classification is not possible in that part of the
stellar parameter space, resulting in a larger average
classification error.

Fig. 3. Corrections applied to the classification errors
measured in the simulation study. For explanation see text.
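The two effects that motivate the corrections (undersampling of the
error distribution and discretization error) can be reproduced with a
small Monte Carlo experiment on a one-dimensional grid. The sketch
below is an illustration only: the function names, the number of
draws, the use of the temperature grid as the example, and the choice
of rms and mean absolute difference as error measures are all
assumptions, not the procedure used to derive the corrections in Fig. 3.

```python
import numpy as np

def snap_to_grid(values, start, step, n_points):
    """Map continuous values to the nearest grid point."""
    idx = np.clip(np.rint((values - start) / step), 0, n_points - 1)
    return start + idx * step

def measured_sigma(sigma_true, start=5200.0, step=200.0, n_points=9,
                   n_draws=100_000, seed=0):
    """Rms error measured on the grid when the true Gaussian error is sigma_true."""
    rng = np.random.default_rng(seed)
    grid = start + step * np.arange(n_points)
    true = rng.choice(grid[2:-2], size=n_draws)   # interior points only
    perturbed = true + rng.normal(0.0, sigma_true, n_draws)
    classified = snap_to_grid(perturbed, start, step, n_points)
    return np.sqrt(np.mean((classified - true) ** 2))

def discretization_error(start=5200.0, step=200.0, n_points=9,
                         n_draws=100_000, seed=0):
    """Mean absolute offset when continuous parameters are mapped to the grid."""
    rng = np.random.default_rng(seed)
    true = rng.uniform(start, start + step * (n_points - 1), n_draws)
    return np.mean(np.abs(snap_to_grid(true, start, step, n_points) - true))
```

For a true error well below half the grid point distance, for example
`measured_sigma(30.0)` on the 200 K temperature grid, the measured
value falls clearly below the input, which is the underestimate
described above. For uniformly distributed parameters,
`discretization_error()` approaches one quarter of the grid point
distance.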

As a plausibility check we compared our results with
the classification accuracies we would expect from
simple, one-dimensional parameterization approaches using
B-V, c_1 and KP as temperature, gravity and
metallicity indicators, respectively. In the effective
temperature range 5200 K < T_eff < 6800 K, Δ(B-V)/ΔT_eff ~
0.028 mag/100 K (Lang 1992). The average accuracy of
the HES B-V calibration in the temperature range
under consideration, averaged over the full magnitude range
covered by the HES, is σ_{B-V} = 0.07 mag (Christlieb et al.
2001b), so that an average temperature classification
accuracy of σ_Teff = 260 K is expected. This is consistent with
classification errors of 200-420 K in the magnitude range
14.0 < B_J < 17.5. Δc_1/Δlog g ~ 0.1-0.3 mag/dex in
the effective temperature range under consideration (Lang
1992). The average accuracy of the HES c_1 calibration is
σ_c1 = 0.15 mag (Christlieb et al. 2001b), so that we expect
a gravity classification precision of σ_log g = 0.5-1.5 dex.
This is consistent with σ_log g < 0.68 dex measured in our
simulation. Finally, from Fig. 4 of Beers et al. (1990) one can
read that at B-V = 0.5, the difference in the Ca K
index KP between a star of [Fe/H] = -2.0 and a star of
[Fe/H] = -3.0 is 2.7 Å and 4.7 Å for dwarfs and giants,
respectively. Considering the fact that σ_KP = 1.22 Å in the
HES, it is not surprising that classification precisions as
high as 0.4 dex can be achieved for the brightest stars in
the HES (see Fig. 4).

Fig. 4. Classification precision for stars in the effective
temperature range 5200 K < T_eff < 6800 K in the HES as
a function of S/N, as obtained with Bayes classification
in our simulation study.
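The one-dimensional estimates in the plausibility check amount to
first-order error propagation: the expected parameter error is the
calibration error of the indicator divided by the sensitivity of the
indicator to the parameter. The sketch below repeats that arithmetic
with the rounded values quoted in the text; the function name is
hypothetical, and treating the KP difference between [Fe/H] = -2.0 and
-3.0 as a constant slope per dex is a simplifying assumption.

```python
def propagated_error(sigma_indicator, sensitivity):
    """Parameter error = indicator error / |d(indicator)/d(parameter)|."""
    return sigma_indicator / abs(sensitivity)

# Temperature from B-V: 0.07 mag calibration error, 0.028 mag per 100 K.
sigma_teff = propagated_error(0.07, 0.028 / 100.0)       # in K

# Gravity from c_1: 0.15 mag calibration error, 0.1-0.3 mag per dex.
sigma_logg_min = propagated_error(0.15, 0.3)             # in dex
sigma_logg_max = propagated_error(0.15, 0.1)             # in dex

# Metallicity from KP: 1.22 A calibration error, 2.7 A (dwarfs) or
# 4.7 A (giants) per dex between [Fe/H] = -2.0 and -3.0.
sigma_feh_dwarf = propagated_error(1.22, 2.7)            # in dex
sigma_feh_giant = propagated_error(1.22, 4.7)            # in dex
```

With these rounded inputs the temperature estimate comes out at about
250 K, close to the 260 K quoted in the text; the small difference
presumably reflects rounding of the quoted inputs. The gravity range
reproduces 0.5-1.5 dex, and the metallicity estimates are roughly
0.45 dex for dwarfs and 0.26 dex for giants.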



5. Discussion and conclusions 

We have demonstrated that automatic spectral classifica- 
tion of turnoff stars, using "classical" statistical methods, 
is feasible in the HES with high accuracy. Our results sug- 
gest that it might be possible to determine the metallicity 
distribution function (MDF) of the galactic halo directly 
from a large sample of HES spectra. The MDF is an impor- 
tant constraint for models of galactic chemical evolution 
(see, e.g., Ikuta & Arimoto 1999; pey 2000).

The described methods are currently being applied to 
the large HES data base of digital spectra, in order to 
select interesting stellar objects in an automated fashion, 
and fully exploit the large scientific potential present in 
our data base. 

Our algorithms can easily be adapted for automatic
classification of other large data sets, e.g. those to be com- 
piled by the DIVA and GAIA missions. 